
feat(realtime): allow tools to opt out of automatic response.create after completion #3033

Closed
jawwad-ali wants to merge 1 commit into openai:main from jawwad-ali:feat/2971-tool-start-response-opt-out

Conversation

@jawwad-ali
Contributor

Summary

Realtime sessions hard-coded `start_response=True` when sending tool outputs back to the model, so every tool unconditionally triggered a follow-up `response.create`. Side-effect tools (analytics, background-job schedulers, telemetry) had no way to stay silent after completion — exactly the gap @aligokalppeker raised in #2971 (comment):

"Response trigger at the end of the tool should be provided as an option to the SDK user; not every tool result is required to create a response."

This PR exposes that toggle on `FunctionTool` and the `@function_tool(...)` decorator. The transport layer (`OpenAIRealtimeWebSocketModel._send_tool_output`) already honors `RealtimeModelSendToolOutput.start_response`; this change just lets tool authors reach it.
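For illustration, a minimal sketch of the intended usage (import paths assume the openai-agents Python SDK; `log_analytics_event` is a hypothetical tool invented for this example):

```python
from agents import function_tool
from agents.realtime import RealtimeAgent

# Hypothetical side-effect tool: with this PR it can opt out of the
# automatic follow-up response.create after it completes.
@function_tool(start_response=False)
def log_analytics_event(event_name: str) -> str:
    """Record an analytics event without making the agent speak."""
    return f"logged {event_name}"

agent = RealtimeAgent(
    name="Assistant",
    instructions="You are a helpful voice assistant.",
    tools=[log_analytics_event],  # completes silently; no response.create
)
```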

What changes

  • `FunctionTool.start_response: bool = True` — new kw-only field (preserves positional-API compatibility per AGENTS.md).
  • `@function_tool(start_response=False)` — new kwarg on the decorator (and matching overloads).
  • `RealtimeSession._handle_tool_call` — for the function-tool success path, reads `func_tool.start_response` instead of hard-coding `True` (sketched after this list).
  • `RealtimeSession._send_tool_rejection` — same treatment for approval-rejected outputs (the original tool's preference still applies).
  • The handoff path keeps `start_response=True` because the new agent must speak after a handoff.

Default remains `True`, so every existing tool keeps current behavior.
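Roughly, the session-side change looks like this (a paraphrase, not the exact diff; the import path for `RealtimeModelSendToolOutput` is assumed):

```python
from agents.realtime.model_inputs import RealtimeModelSendToolOutput

async def _handle_tool_call_success(self, event, func_tool, result) -> None:
    # Paraphrase of the changed success path in RealtimeSession._handle_tool_call:
    # start_response was previously hard-coded to True.
    await self._model.send_event(
        RealtimeModelSendToolOutput(
            tool_call=event,
            output=str(result),
            start_response=func_tool.start_response,  # tool author's preference
        )
    )
```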

Scope note (re: #2971's race report)

Per @seratch's response in the same thread, `v0.14.2` already routes `response.create` through `_ResponseCreateSequencer` and the active-response race is gated until a concrete repro exists. This PR does not touch the sequencer or add any tracking counters. It only addresses the per-tool ergonomics raised in @aligokalppeker's follow-up, which is independent of the race-detection path.

Test plan

Verified locally on Windows (Python 3.13):

  • `uv run ruff format` — pass
  • `uv run ruff check` — pass (`All checks passed!`)
  • `uv run mypy src/agents/tool.py src/agents/realtime/session.py tests/test_function_tool.py tests/realtime/test_session.py` — pass (`Success: no issues found in 4 source files`)
  • `uv run pyright --project pyrightconfig.json src/agents/tool.py src/agents/realtime/session.py tests/test_function_tool.py tests/realtime/test_session.py` — pass (`0 errors, 0 warnings, 0 informations`)
  • `uv run pytest tests/test_function_tool.py tests/realtime/test_session.py -q` — 112 passed
  • Broader sweep `-k "tool or function or realtime"` — 1074 passed, 1 skipped. 4 unrelated pre-existing failures on clean `main` (`test_tracing_errors{,streamed}.py::test_tool_call_error`, `test_openai_realtime.py::TestSendEventAndConfig::test_interrupt*`) confirmed via `git stash` round-trip — not introduced by this change.

New tests (6 total):

| Test | Coverage |
| --- | --- |
| `test_function_tool_start_response_defaults_to_true` | `FunctionTool` default value |
| `test_function_tool_start_response_can_be_set_to_false` | Direct kw-only construction |
| `test_function_tool_decorator_threads_start_response` | `@function_tool(start_response=False)` |
| `test_function_tool_decorator_default_start_response_is_true` | Plain `@function_tool` keeps `True` |
| `test_function_tool_start_response_false_skips_response_create` | Realtime session forwards `False` to transport |
| `test_function_tool_default_start_response_remains_true` | Backward-compat for existing realtime tools |
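As a sketch of the decorator-level tests (the tool body is illustrative, not the actual test code):

```python
from agents import function_tool

def test_function_tool_decorator_threads_start_response():
    # The decorator should thread the kwarg onto the resulting FunctionTool.
    @function_tool(start_response=False)
    def silent_tool() -> str:
        """Illustrative no-op tool."""
        return "done"

    assert silent_tool.start_response is False
```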

Issue number

Refs #2971

Checks

  • I've added new tests (if relevant)
  • I've added/updated the relevant documentation (`docs/realtime/guide.md` gained a short example for `@function_tool(start_response=False)`)
  • I've run `make lint` and `make format`
  • I've made sure tests pass

feat(realtime): allow tools to opt out of automatic response.create after completion

Realtime sessions hard-coded `start_response=True` when sending tool outputs
back to the model, so every tool unconditionally triggered a follow-up
`response.create`. Side-effect tools (analytics, background-job
schedulers, telemetry) had no way to stay silent after completion.

This adds a `start_response: bool = True` kw-only field to `FunctionTool`
and threads it through the `@function_tool(...)` decorator. The realtime
session then honors the field when emitting `RealtimeModelSendToolOutput`
for both successful tool execution and approval rejections; the handoff
path still triggers `response.create` because the next agent must speak.

The default stays `True`, so existing tools keep their current behavior.

Refs openai#2971
github-actions bot added the `documentation` (Improvements or additions to documentation), `enhancement` (New feature or request), and `feature:realtime` labels on Apr 26, 2026
@seratch
Member

seratch commented Apr 27, 2026

Thanks for the suggestion. I think having this flag on the function_tool decorator side may not be optimal in terms of the SDK design. Having a universal option only per realtime agent session may make sense, but I am not confident enough that adding such an option is really helpful for some use cases. We don't plan to add this option at least for now, so let us close this PR.

@seratch seratch closed this Apr 27, 2026
@aligokalppeker
Contributor

aligokalppeker commented Apr 27, 2026

@seratch, what do you think about not forcing response creation when a tool is executed? Do we have to force a response after every tool execution? Another valid scenario is for a tool to simply add its output to the bot context without triggering a response.

In real-time voice, one might want a tool to save a user's/tool's data in the background without interrupting the audio stream.

This is a widely used concept with modern frameworks.

LangGraph treats agents as cyclical graphs (State Machines). Tool execution is just a node that modifies a global "State." It does not automatically trigger an LLM response unless you explicitly draw an edge to the LLM node.
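Illustratively, a minimal LangGraph sketch of that pattern (assuming current `langgraph` APIs; names are made up):

```python
from typing import TypedDict

from langgraph.graph import END, StateGraph

class State(TypedDict):
    events: list[str]

def log_tool(state: State) -> State:
    # Tool node: mutates state only; no LLM call is implied.
    return {"events": state["events"] + ["logged"]}

graph = StateGraph(State)
graph.add_node("log_tool", log_tool)
graph.set_entry_point("log_tool")
graph.add_edge("log_tool", END)  # no edge back to an LLM node -> no response
app = graph.compile()
```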

CrewAI is built around Tasks, with Tools provided to accomplish them. The agent executes tools internally in a loop without returning a conversational response to the user until the Task is marked complete.

Haystack's ToolInvoker is a pipeline component wired by connecting outputs to inputs. If you don't wire the ToolInvoker's output back into a Generator (LLM), no response is generated.

Forcing an LLM response after every tool execution is an anti-pattern for Realtime and Agentic workflows. Modern frameworks treat tool execution as state mutation (LangGraph, Rasa), pluggable functions (Semantic Kernel), background tasks (CrewAI), or explicit pipeline components (Haystack). Tying tool execution strictly to dialogue generation introduces unavoidable latency and prevents silent background operations, which are critical for smooth Voice/Realtime UX.

Closing this PR without suggesting any alternative is not the right call. The framework around a model is an important part of making it usable... and with decisions like this one, OpenAI really makes other models' jobs easier.
